The TÜbİTAK-UEKAE statistical machine translation system for IWSLT 2007

نویسندگان

  • Coskun Mermer
  • Hamza Kaya
  • Mehmet Ugur Dogan
چکیده

We describe the TÜBITAK-UEKAE system that participated in the Arabic-to-English and Japanese-toEnglish translation tasks of the IWSLT 2007 evaluation campaign. Our system is built on the open-source phrasebased statistical machine translation software Moses. Among available corpora and linguistic resources, only the supplied training data and an Arabic morphological analyzer are used in the system. We present the run-time lexical approximation method to cope with out-of-vocabulary words during decoding. We tested our system under both automatic speech recognition (ASR) and clean transcript (clean) input conditions. Our system was ranked first in both Arabic-toEnglish and Japanese-to-English tasks under the “clean” condition.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The tÜbİTAK-UEKAE statistical machine translation system for IWSLT 2008

In this study, the TÜBİTAK-UEKAE statistical machine translation system based on the open-source phrasebased statistical machine translation software, Moses, is presented. Additionally, phrase-table augmentation is applied to maximize source language coverage; lexical approximation is applied to replace out-of-vocabulary words with known words prior to decoding; and automatic punctuation insert...

متن کامل

The tÜBITAK-UEKAE statistical machine translation system for IWSLT 2009

We describe our Arabic-to-English and Turkish-to-English machine translation systems that participated in the IWSLT 2009 evaluation campaign. Both systems are based on the Moses statistical machine translation toolkit, with added components to address the rich morphology of the source languages. Three different morphological approaches are investigated for Turkish. Our primary submission uses l...

متن کامل

The TÜBITAK-UEKAE statistical machine translation system for IWSLT 2010

We report on our participation in the IWSLT 2010 evaluation campaign. Similar to previous years, our submitted systems are based on the Moses statistical machine translation toolkit. This year, we also experimented with hierarchical phrasebased models. In addition, we utilized automatic minimum error-rate training instead of manually-guided tuning. We focused more on the BTEC Turkish-English ta...

متن کامل

The TÜBİTAK statistical machine translation system for IWSLT 2012

We describe the TÜBİTAK submission to the IWSLT 2012 Evaluation Campaign. Our system development focused on utilizing Bayesian alignment methods such as variational Bayes and Gibbs sampling in addition to the standard GIZA++ alignments. The submitted tracks are the ArabicEnglish and Turkish-English TED Talks translation tasks.

متن کامل

The CASIA phrase-based statistical machine translation system for IWSLT 2007

This paper describes our phrase-based statistical machine translation system (CASIA) used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007. In this year's evaluation, we participated in the open data track of clean text for the Chinese-to-English machine translation. Here, we mainly introduce the overview of the system, the primary modules, th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007